Autonomous Driving with Deep Reinforcement Learning
Overview
Trained a simulated autonomous vehicle with deep reinforcement learning (DDPG) to complete laps on a previously unseen track while avoiding randomly placed obstacles, then added a CNN-augmented variant (CNN-DDPG) intended to capture the vehicle's spatio-temporal surroundings.
Key points
- Used DDPG (actor-critic, off-policy) with Ornstein-Uhlenbeck exploration noise; minibatch size 64.
- Built the environment in V-REP: Bezier-spline tracks with obstacles re-randomized every episode to prevent overfitting.
- State = 20 proximity sensors stacked ×3 (60 values) + 5 inputs (centerline/boundary distance, track angle, x/y velocity); action = tanh throttle/brake + Ackermann steering.
- Reward = R(speed) − [crash penalty + LiDAR-proximity penalty].
- Trained for ~4,000 episodes (~300k steps); over 50 test runs, DDPG averaged 1,210 m before collision versus 1,380 m for CNN-DDPG (+14%), and CNN-DDPG was the more stable of the two.
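The Ornstein-Uhlenbeck exploration noise mentioned above can be sketched as follows. This is a minimal illustration, not the project's code: the class name `OUNoise`, the helper `noisy_action`, and the parameters `theta=0.15`, `sigma=0.2`, and `dt=1.0` are assumptions (the first two are common defaults in DDPG write-ups; the source does not state the values actually used).

```python
import numpy as np


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise.

    All parameter values here are illustrative assumptions.
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = mu * np.ones(size)
        self.theta = theta      # pull strength back toward the mean
        self.sigma = sigma      # scale of the random perturbation
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Restart the process at its mean (typically once per episode)."""
        self.state = self.mu.copy()

    def sample(self):
        """Advance the process one step and return the new noise vector."""
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(
            self.state.shape
        )
        self.state = self.state + drift + diffusion
        return self.state


def noisy_action(actor_output, noise):
    """Add exploration noise and clip to the tanh action range [-1, 1]."""
    return np.clip(actor_output + noise.sample(), -1.0, 1.0)
```

Because consecutive samples are correlated, the perturbation tends to push throttle and steering in the same direction for several steps, which suits a vehicle with momentum better than independent noise at every step.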
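The 65-value state described above (20 proximity readings stacked ×3, plus 5 extra inputs) could be assembled roughly as below. This sketch assumes that "stacked ×3" means the three most recent sensor scans and that the five extras are centerline distance, boundary distance, track angle, and x/y velocity; `StateBuilder` and its method names are hypothetical.

```python
from collections import deque

import numpy as np

N_SENSORS = 20  # proximity sensors per scan (from the description)
N_STACK = 3     # scans kept in the stack (assumed: consecutive time steps)
N_EXTRA = 5     # assumed: centerline dist, boundary dist, track angle, vx, vy


class StateBuilder:
    """Keeps the last N_STACK scans and appends the extra inputs."""

    def __init__(self):
        self.frames = deque(maxlen=N_STACK)

    def reset(self, first_scan):
        """At episode start, fill the stack with copies of the first scan."""
        scan = np.asarray(first_scan, dtype=np.float32)
        assert scan.shape == (N_SENSORS,)
        self.frames.clear()
        for _ in range(N_STACK):
            self.frames.append(scan)

    def step(self, scan, extras):
        """Push the newest scan (oldest drops out) and return the flat state."""
        scan = np.asarray(scan, dtype=np.float32)
        extras = np.asarray(extras, dtype=np.float32)
        assert scan.shape == (N_SENSORS,)
        assert extras.shape == (N_EXTRA,)
        self.frames.append(scan)
        return np.concatenate([*self.frames, extras])  # 3*20 + 5 = 65 values
```

Stacking consecutive scans gives a feed-forward network some access to motion information, such as how fast an obstacle is approaching, that a single scan cannot convey.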
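One possible reading of the action bullet is that a single tanh output in [-1, 1] is split into throttle and brake, and that the steering command is converted into left and right wheel angles using Ackermann geometry. The sketch below illustrates that reading only; the function names, the `wheelbase` and `track_width` values, and the convention that a positive angle means a left turn are all assumptions.

```python
import math


def split_throttle_brake(a):
    """Map one tanh output in [-1, 1] to (throttle, brake), each in [0, 1]."""
    a = max(-1.0, min(1.0, a))
    return (a, 0.0) if a >= 0.0 else (0.0, -a)


def ackermann_angles(delta, wheelbase=2.5, track_width=1.5):
    """Return (left, right) front-wheel angles in radians for a centre
    steering angle `delta`. Positive delta is assumed to mean a left turn.
    Vehicle dimensions are placeholder values.
    """
    if abs(delta) < 1e-9:
        return 0.0, 0.0
    radius = wheelbase / math.tan(abs(delta))   # turn radius at the axle centre
    if radius <= track_width / 2:
        raise ValueError("steering angle too large for this geometry")
    inner = math.atan(wheelbase / (radius - track_width / 2))
    outer = math.atan(wheelbase / (radius + track_width / 2))
    if delta > 0:
        return inner, outer       # left wheel is on the inside of a left turn
    return -outer, -inner         # right wheel is on the inside of a right turn
```

The inner wheel turns more sharply than the outer one, so both wheels trace circles around the same centre.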
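The reward structure R(speed) − [crash penalty + LiDAR-proximity penalty] can be illustrated with a small function. The source gives only this overall shape, so the linear speed term, the constant crash penalty, the hinge-style proximity penalty, and every default value below are hypothetical placeholders.

```python
def reward(speed, crashed, min_distance, *,
           speed_gain=1.0, crash_penalty=100.0,
           safe_distance=1.0, proximity_gain=1.0):
    """Reward = R(speed) - (crash penalty + proximity penalty).

    speed        : forward speed of the vehicle
    crashed      : True if a collision occurred on this step
    min_distance : smallest distance reported by the range sensors
    All functional forms and constants are assumptions for illustration.
    """
    r_speed = speed_gain * speed                         # assumed linear R(speed)
    p_crash = crash_penalty if crashed else 0.0          # assumed constant penalty
    # Assumed hinge: no penalty beyond safe_distance, growing as obstacles near.
    p_proximity = proximity_gain * max(0.0, safe_distance - min_distance)
    return r_speed - (p_crash + p_proximity)
```

For example, with these placeholder defaults, `reward(2.0, False, 0.5)` yields 1.5: the speed term of 2.0 minus a proximity penalty of 0.5.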
Figures
[Figure: autonomous.gif]
[Figure: autonomous2.gif]